
Professor of Urban Analytics & Current Head of Department @ Bartlett Centre for Advanced Spatial Analysis (CASA), UCL
Geographer by background - ex-Secondary School Teacher - back in HE for 15+ years
Taught GIS / Spatial Data Science at postgrad level for last 10 years
Whistle-stop tour of some of the key concepts relating to spatial data
Some examples of how carrying out analyses of spatial data can require additional attention - “spatial is special”
Everything happens somewhere

Coordinates are more reliable than place names, which are rarely unique and often refer to fuzzy locations
The earth is roughly spherical and points anywhere on its surface can be described using the World Geodetic System (WGS) - a geographic (spherical) coordinate system
Points can be referenced according to their position on a grid of latitudes (degrees north or south of the equator) and longitudes (degrees east or west of the Prime - Greenwich - meridian)
The last major revision of the World Geodetic System was in 1984, and WGS84 is still used today as the standard system for referencing places on the globe.
Where? Coordinate Reference Systems
Projected Coordinate Reference Systems convert the 3D globe to a 2D plane and can do so in a huge variety of different ways
Most national mapping agencies have their own projected coordinate systems - in Britain the Ordnance Survey maintain the British National Grid which locates places according to 6-digit Easting and Northing coordinates
Most coordinate systems in common use can be referenced by an EPSG code, e.g. WGS84 = 4326 and British National Grid = 27700, and mathematical transformations convert coordinates between them
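To make the idea of a projection concrete, here is a minimal sketch in Python of one such mathematical transformation: spherical Web Mercator (EPSG:3857), chosen here only because its formula is short. It is not the British National Grid transformation, which uses a Transverse Mercator projection plus a datum shift and is better left to a library. The function names and the sample coordinates for central London are illustrative assumptions.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS84 semi-major axis), in metres
EARTH_RADIUS_M = 6378137.0


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees (EPSG:4326) to x/y in metres (EPSG:3857)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_wgs84(x, y):
    """Inverse transformation: x/y in metres back to lon/lat in degrees."""
    lon_deg = math.degrees(x / EARTH_RADIUS_M)
    lat_deg = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon_deg, lat_deg


# Approximate coordinates for central London (illustrative values only)
lon, lat = -0.1276, 51.5072
x, y = wgs84_to_web_mercator(lon, lat)
print("projected:", round(x), round(y))
print("round trip:", web_mercator_to_wgs84(x, y))
```

The round trip recovers the original longitude and latitude, which is the sense in which the two coordinate systems describe the same place in different ways.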

Once we have a coordinate reference system we can locate objects accurately in space
Most objects that spatial data scientists are concerned with (apart from gridded representations, which we will ignore for now!) can be simplified to either a point, a line or a polygon in that space
Polygons and lines are just multiple point coordinates joined together!
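As a small illustration of points, lines and polygons being nothing more than coordinates, the sketch below stores each geometry as plain Python tuples and derives a line length and a polygon area from them. The coordinates are made up and assumed to be in a projected system with units of metres; the helper names are hypothetical.

```python
import math

# A point is a single coordinate pair
point = (530000.0, 180000.0)

# A line is an ordered list of points
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

# A polygon is a ring of points; the last vertex joins back to the first
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def line_length(coords):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(line_length(line))      # 5 + 6 = 11.0
print(polygon_area(polygon))  # 4 x 3 rectangle = 12.0
```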
“Everything is related to everything else, but near things are more related than distant things.” (Tobler's First Law of Geography)

This observation underpins much of what spatial data scientists do
Being able to locate something in space allows us to:
explain why something may be occurring where it is
make better predictions about nearby or further away things
This underpins the whole Geodemographics (customer segmentation) industry!

Near and distant can mean different things in different contexts
In spatial data science one simple way of separating near from distant is to define the topological relationship between objects, for example whether two areas touch
The Dimensionally Extended 9-Intersection Model (DE-9IM) is the standard topological model used in GIS to define spatial relationships between objects
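A full DE-9IM implementation needs a geometry library, so the sketch below uses a much simpler proxy for the "touches" relationship: two polygons are treated as neighbours if they share vertices. The "queen" and "rook" labels (one shared vertex versus a shared edge, approximated here as two shared vertices) are common spatial weights terminology introduced for illustration, and the four square "wards" are invented. Real boundaries can touch without sharing identical vertices, so this is an approximation only.

```python
# Four hypothetical square wards on a unit grid
wards = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],  # shares an edge with A
    "C": [(1, 1), (2, 1), (2, 2), (1, 2)],  # shares only a corner with A
    "D": [(3, 3), (4, 3), (4, 4), (3, 4)],  # touches nothing
}


def shared_vertices(ring_a, ring_b):
    """Count the vertices that two polygon rings have in common."""
    return len(set(ring_a) & set(ring_b))


def contiguity(polygons, rule="queen"):
    """Neighbour lists: queen needs 1 shared vertex, rook needs 2 (an edge)."""
    needed = 1 if rule == "queen" else 2
    neighbours = {name: [] for name in polygons}
    for a in polygons:
        for b in polygons:
            if a != b and shared_vertices(polygons[a], polygons[b]) >= needed:
                neighbours[a].append(b)
    return neighbours


print(contiguity(wards, "queen"))
print(contiguity(wards, "rook"))
```

Under the queen rule ward A is near both B and C; under the rook rule only B counts, because C touches A at a single corner.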

If we measure the distance from the centre (centroid) of one ward to every other ward, we might decide that the 1st, 2nd, 3rd, ..., kth closest wards are near and all the others are far.
We can then include the “k” nearest neighbours and exclude the rest
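The k-nearest-neighbours idea can be sketched in a few lines of Python. The ward names and centroid coordinates below are invented for illustration, and straight-line (Euclidean) distance is assumed, which only makes sense for projected coordinates.

```python
import math

# Hypothetical ward centroids in projected coordinates
centroids = {
    "W1": (0.0, 0.0),
    "W2": (1.0, 0.0),
    "W3": (0.0, 2.0),
    "W4": (5.0, 5.0),
    "W5": (1.0, 1.0),
}


def k_nearest(points, k):
    """For each ward, return the names of its k closest wards by centroid distance."""
    result = {}
    for name, xy in points.items():
        ranked = sorted(
            (math.dist(xy, other_xy), other)
            for other, other_xy in points.items()
            if other != name
        )
        result[name] = [other for _, other in ranked[:k]]
    return result


neighbours = k_nearest(centroids, k=2)
print(neighbours["W1"])  # W2 and W5 are the two closest wards to W1
print(neighbours["W4"])  # W5 and W3 are the two closest wards to W4
```

Note that k-nearest relationships are not necessarily symmetric: W5 is one of W4's two nearest neighbours, but W4 is not one of W5's.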




Where in London do students do best and worst at school?
Is there any pattern? Do better scores and worse scores appear to be clustered? How can we tell?
Spatial Autocorrelation
Spatial Autocorrelation is the phenomenon of near things being more similar than distant things. Do neighbouring wards have more similar GCSE points scores than distant wards?
We can test for spatial autocorrelation by comparing the GCSE Scores in any given ward with the GCSE scores in neighbouring wards (however we choose to define our neighbours - k-nearest, those that are contiguous etc.)
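The comparison described above can be sketched from first principles. The code below is a minimal illustration of the global Moran's I statistic with row-standardised weights, not the routine that produced the results in these slides; the four wards in a row and their scores are invented, and the function names are hypothetical.

```python
def row_standardise(neighbours):
    """Give each ward's neighbours equal weights that sum to 1."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbours.items() if js}


def spatial_lag(values, weights):
    """Weighted average of the values in each ward's neighbours."""
    return {i: sum(w * values[j] for j, w in row.items()) for i, row in weights.items()}


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2)."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {i: v - mean for i, v in values.items()}
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(d * d for d in z.values())


# Four hypothetical wards in a row: A - B - C - D
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
weights = row_standardise(neighbours)

smooth = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}       # similar values sit together
alternating = {"A": 1.0, "B": 4.0, "C": 1.0, "D": 4.0}  # neighbours always differ

print(spatial_lag(smooth, weights))
print(morans_i(smooth, weights))       # positive: clustering of similar values
print(morans_i(alternating, weights))  # negative: neighbours are dissimilar
```

A positive value indicates that similar scores cluster together, a negative value that neighbouring scores tend to differ, and a value near the expectation under randomness indicates no spatial pattern.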
(Intercept)   slag_gcse
  4.3340700   0.9863434

Moran’s I
Moran I test under randomisation
data: LondonWardsMerged$average_gcse_capped_point_scores_2014
weights: Lward.lw
Moran I statistic standard deviate = 17.658, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
      0.407894262      -0.001602564       0.000537799
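As a quick check on how the figures in this output relate to one another, the sketch below recomputes the standard deviate from the three sample estimates. It assumes the usual definitions: the standard deviate is (I - E[I]) / sqrt(Var[I]), and under the null hypothesis E[I] = -1 / (n - 1), where n is the number of spatial units.

```python
import math

# Sample estimates copied from the test output above
moran_i = 0.407894262
expectation = -0.001602564
variance = 0.000537799

# Standard deviate (z-score) of the observed statistic
z = (moran_i - expectation) / math.sqrt(variance)

# Number of spatial units implied by E[I] = -1 / (n - 1)
n = 1 - 1 / expectation

print(round(z, 3))
print(round(n))
```

The recomputed deviate agrees with the reported 17.658, and the expectation implies about 625 spatial units in the weights object. An observed I this far above its expectation is why the p-value is so small: GCSE scores in neighbouring wards are much more similar than would be expected by chance.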
